A New Model for Arabic Text Clustering by Word Embedding and Arabic Word Net
Similar resources
Word sense disambiguation for Arabic text categorization
In this paper, we present two contributions to Arabic Word Sense Disambiguation. In the first, we propose using two external resources, AWN and WN, based on a Term to Term Machine Translation System (MTS). The second contribution concerns the disambiguation strategy: it consists of choosing the nearest concept for the ambiguous terms, based on more relationships with different concep...
A new approach toward word selection in Arabic language
Word formation and word selection are, on the one hand, considered among the higher processes of word equalization, which has wide application for language theorists and researchers seeking to enrich a language or, better, to free them from the challenge of word inadequacy and from the lack of knowledge of words in technology and industry fields, especially in Arabic and Persian sp...
Language Model Based Arabic Word Segmentation
We approximate Arabic’s rich morphology by a model in which a word consists of a sequence of morphemes in the pattern prefix*-stem-suffix* (* denotes zero or more occurrences of a morpheme). Our method is seeded by a small manually segmented Arabic corpus and uses it to bootstrap an unsupervised algorithm to build the Arabic word segmenter from a large unsegmented Arabic corpus. The algorithm uses ...
Translating Dialectal Arabic as Low Resource Language using Word Embedding
A number of machine translation methods have been proposed in recent years to deal with the increasingly important problem of automatic translation between texts of different languages or languages and their dialects. These methods have produced promising results when applied to some of the widely studied languages. Existing translation methods are mainly implemented using rule-based and static...
Word Extraction and Recognition in Arabic Handwritten Text
Segmenting Arabic manuscripts into text-lines and words is an important step in making recognition systems more efficient and accurate. The major difficulty in this task is the word extraction process: first, words are often a succession of sub-words where the spacing between these sub-words does not follow any rule. Second, the presence of connections even between non-adjacent sub-w...
Journal
Journal title: Saudi Journal of Engineering and Technology
Year: 2019
ISSN: 2415-6272,2415-6264
DOI: 10.36348/sjeat.2019.v04i10.001